SEGCloud: Semantic Segmentation of 3D Point Clouds
3D semantic scene labeling is fundamental to agents operating in the real
world. In particular, labeling raw 3D point sets from sensors provides
fine-grained semantics. Recent works leverage the capabilities of Neural
Networks (NNs), but are limited to coarse voxel predictions and do not
explicitly enforce global consistency. We present SEGCloud, an end-to-end
framework to obtain 3D point-level segmentation that combines the advantages of
NNs, trilinear interpolation (TI) and fully connected Conditional Random Fields
(FC-CRF). Coarse voxel predictions from a 3D Fully Convolutional NN are
transferred back to the raw 3D points via trilinear interpolation. Then the
FC-CRF enforces global consistency and provides fine-grained semantics on the
points. We implement the latter as a differentiable Recurrent NN to allow joint
optimization. We evaluate the framework on two indoor and two outdoor 3D
datasets (NYU V2, S3DIS, KITTI, Semantic3D.net), and show performance
comparable or superior to the state-of-the-art on all datasets.
Comment: Accepted as a spotlight at the International Conference on 3D Vision (3DV 2017).
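The transfer of coarse voxel predictions back to the raw points via trilinear interpolation can be sketched as follows. This is an illustrative NumPy sketch under assumed conventions (grid origin at a voxel corner, scores indexed `(X, Y, Z, C)`), not the authors' implementation; the function and argument names are hypothetical.

```python
import numpy as np

def trilinear_transfer(voxel_scores, points, origin, voxel_size):
    """Transfer per-voxel class scores to raw 3D points.

    voxel_scores: (X, Y, Z, C) coarse class scores from the 3D FCNN.
    points:       (N, 3) raw point coordinates.
    origin:       (3,) world position of the grid corner (assumed convention).
    """
    # Continuous grid coordinates of each point
    g = (points - origin) / voxel_size
    g0 = np.floor(g).astype(int)
    frac = g - g0
    X, Y, Z, C = voxel_scores.shape
    out = np.zeros((len(points), C))
    # Blend the 8 surrounding voxels with trilinear weights
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = (np.where(dx, frac[:, 0], 1 - frac[:, 0])
                     * np.where(dy, frac[:, 1], 1 - frac[:, 1])
                     * np.where(dz, frac[:, 2], 1 - frac[:, 2]))
                ix = np.clip(g0[:, 0] + dx, 0, X - 1)
                iy = np.clip(g0[:, 1] + dy, 0, Y - 1)
                iz = np.clip(g0[:, 2] + dz, 0, Z - 1)
                out += w[:, None] * voxel_scores[ix, iy, iz]
    return out
```

The interpolated point-level scores would then serve as unaries for the FC-CRF stage.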
Volumetric Semantically Consistent 3D Panoptic Mapping
We introduce an online 2D-to-3D semantic instance mapping algorithm aimed at
generating comprehensive, accurate, and efficient semantic 3D maps suitable for
autonomous agents in unstructured environments. The proposed approach is based
on a Voxel-TSDF representation used in recent algorithms. It introduces novel
ways of integrating semantic prediction confidence during mapping, producing
semantic and instance-consistent 3D regions. Further improvements are achieved
by graph optimization-based semantic labeling and instance refinement. The
proposed method achieves accuracy superior to the state of the art on public
large-scale datasets, improving on a number of widely used metrics. We also
highlight a downfall in the evaluation of recent studies: using the ground
truth trajectory as input instead of a SLAM-estimated one substantially affects
the accuracy, creating a large gap between the reported results and the actual
performance on real-world data.
Comment: 8 pages, 2 figures.
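Confidence-weighted fusion of semantic predictions into a Voxel-TSDF map can be illustrated with a minimal per-voxel record. This is a generic sketch of the common pattern (weighted-average TSDF update plus accumulated class evidence), with hypothetical names; the paper's actual integration and graph-based refinement are more involved.

```python
import numpy as np

class SemanticVoxel:
    """Illustrative voxel record: TSDF value plus fused class confidences."""

    def __init__(self, num_classes):
        self.tsdf = 1.0            # truncated signed distance, init far/empty
        self.weight = 0.0          # accumulated integration weight
        self.class_conf = np.zeros(num_classes)

    def integrate(self, sdf_obs, sem_probs, obs_weight=1.0):
        # Standard weighted-average TSDF update
        self.tsdf = ((self.tsdf * self.weight + sdf_obs * obs_weight)
                     / (self.weight + obs_weight))
        self.weight += obs_weight
        # Accumulate semantic evidence scaled by per-frame confidence
        self.class_conf += obs_weight * sem_probs

    def label(self):
        # Current most-likely semantic class for this voxel
        return int(np.argmax(self.class_conf))
```

Integrating each frame's per-voxel class probabilities this way keeps a running, confidence-weighted semantic estimate alongside the geometry.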
Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature
Point cloud registration has seen recent success with several learning-based
methods that focus on correspondence matching and, as such, optimize only for
this objective. Following the learning step of correspondence matching, they
evaluate the estimated rigid transformation with a RANSAC-like framework. While
it is an indispensable component of these methods, it prevents a fully
end-to-end training, leaving the objective of minimizing the pose error
unaddressed. We present a novel solution, Q-REG, which utilizes rich geometric
information to estimate the rigid pose from a single correspondence. Q-REG
makes it possible to formalize the robust estimation as an exhaustive search, hence
enabling end-to-end training that optimizes over both objectives of
correspondence matching and rigid pose estimation. We demonstrate in the
experiments that Q-REG is agnostic to the correspondence matching method and
provides consistent improvement both when used only in inference and in
end-to-end training. It sets a new state-of-the-art on the 3DMatch, KITTI, and
ModelNet benchmarks.
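Recovering a rigid pose from a single correspondence can be sketched with local reference frames, e.g. built from the surface normal and principal curvature directions at each matched point. The sketch below assumes such orthonormal frames are given; it illustrates the geometric idea only, and the names are hypothetical rather than Q-REG's actual interface.

```python
import numpy as np

def pose_from_single_correspondence(p_src, F_src, p_tgt, F_tgt):
    """Rigid pose (R, t) from one correspondence with local reference frames.

    F_src, F_tgt: 3x3 orthonormal frames (columns = e.g. curvature
    directions and normal) at the matched source/target points.
    """
    # Rotation that carries the source frame onto the target frame
    R = F_tgt @ F_src.T
    # Translation that aligns the matched points under that rotation
    t = p_tgt - R @ p_src
    return R, t
```

Because each correspondence yields a full pose hypothesis, robust estimation reduces to scoring one hypothesis per correspondence, an exhaustive search that, unlike RANSAC, is differentiable-friendly and trainable end-to-end.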
SemSpray: Virtual reality as-is semantic information labeling tool for 3D spatial data
Capturing the as-is status of buildings in the form of 3D spatial data has become more accurate and efficient, but extracting as-is information from that data has not seen similar advances. State-of-the-art practice requires experts to manually interact with the spatial data in a laborious and time-consuming process. We propose Semantic Spray (SemSpray), a Virtual Reality (VR) application that provides users with intuitive tools to produce semantic information on as-is 3D spatial data of buildings. The goal is to perform this task accurately and more efficiently by allowing users to interact with the data at different scales.
HoloLabel: Augmented reality user-in-the-loop online annotation tool for as-is building information
3D sensing devices simplify the task of capturing and reconstructing an environment as a spatial 3D mesh. However, the task of converting this purely geometric information to a semantically meaningful as-is building model is non-trivial. State-of-the-art practice follows a first step of acquiring the spatial 3D mesh on site and subsequently resorts to manual semantic labeling in the office, where experts have to work for many hours using non-intuitive and error-prone tools. We develop HoloLabel, an Augmented Reality application that allows users to directly and on-site annotate a scene in 3D with rich semantic information while simultaneously capturing its spatial 3D mesh.
ImpliCity: City Modeling From Satellite Images with Deep Implicit Occupancy Fields
High-resolution optical satellite sensors, combined with dense stereo algorithms, have made it possible to reconstruct 3D city models from space. However, these models are, in practice, rather noisy and tend to miss small geometric features that are clearly visible in the images. We argue that one reason for the limited quality may be a too early, heuristic reduction of the triangulated 3D point cloud to an explicit height field or surface mesh. To make full use of the point cloud and the underlying images, we introduce ImpliCity, a neural representation of the 3D scene as an implicit, continuous occupancy field, driven by learned embeddings of the point cloud and a stereo pair of ortho-photos. We show that this representation enables the extraction of high-quality DSMs: with image resolution 0.5 m, ImpliCity reaches a median height error of ≈ 0.7 m and outperforms competing methods, especially w.r.t. building reconstruction, featuring intricate roof details, smooth surfaces, and straight, regular outlines.
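Extracting a DSM (digital surface model) from a continuous occupancy field amounts to finding, for each ground-plan location, the highest point the field considers occupied. The sketch below illustrates this with a stand-in occupancy function; the sampling scheme, names, and threshold are assumptions for illustration, not ImpliCity's actual extraction procedure.

```python
import numpy as np

def extract_dsm(occupancy_fn, xy_grid, z_min, z_max, n_samples=64, thresh=0.5):
    """Sample each vertical column top-down and return the highest z whose
    occupancy exceeds `thresh`.

    occupancy_fn: callable mapping (N, 3) points to (N,) values in [0, 1],
    standing in for the learned implicit network.
    """
    zs = np.linspace(z_max, z_min, n_samples)  # top-down sampling
    heights = np.full(len(xy_grid), float(z_min))
    for i, (x, y) in enumerate(xy_grid):
        pts = np.column_stack([np.full(n_samples, x),
                               np.full(n_samples, y), zs])
        occ = occupancy_fn(pts)
        hit = np.nonzero(occ > thresh)[0]
        if hit.size:
            heights[i] = zs[hit[0]]  # first occupied sample from the top
    return heights
```

In practice the height would be refined, e.g. by bisection between the last empty and first occupied sample, rather than taken at the coarse sample spacing.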